Heritable transcriptional defects from aberrations of nuclear architecture

Transcriptional heterogeneity due to plasticity of the epigenetic state of chromatin contributes to tumour evolution, metastasis and drug resistance1–3. However, the mechanisms that cause this epigenetic variation are incompletely understood. Here we identify micronuclei and chromosome bridges, aberrations in the nucleus common in cancer4,5, as sources of heritable transcriptional suppression. Using a combination of approaches, including long-term live-cell imaging and same-cell single-cell RNA sequencing (Look-Seq2), we identified reductions in gene expression in chromosomes from micronuclei. With heterogeneous penetrance, these changes in gene expression can be heritable even after the chromosome from the micronucleus has been re-incorporated into a normal daughter cell nucleus. Concomitantly, micronuclear chromosomes acquire aberrant epigenetic chromatin marks. These defects may persist as variably reduced chromatin accessibility and reduced gene expression after clonal expansion from single cells. Persistent transcriptional repression is strongly associated with, and may be explained by, markedly long-lived DNA damage. Epigenetic alterations in transcription may therefore be inherently coupled to chromosomal instability and aberrations in nuclear architecture.


Stamatis Papathanasiou 1,2,3#, Nikos A. Mynhier 1,2*, Shiwei Liu 2,4*, Gregory Brunette 1,2, Ema Stokasimov 1,2, Etai Jacob 5,6,7, Lanting Li 5,6,8, Caroline Comenho 9,10,11, Bas van Steensel 12, Jason D. Buenrostro 9,10,11, Cheng-Zhong Zhang 5,6,8#, David Pellman 1,2,5,9,13#

Cells for which RNA-Seq data have been generated are grouped by the experimental design (Column A) and the identity of the ancestor cell (Column B: "Family ID"). Each family consists of cells descended from a single ancestor as identified by live-cell imaging, and each family member is assigned a unique ID (Column C: "Cell ID"). Columns D-L summarize quality metrics of each single-cell library as generated by STAR. Control RPE-1 cells (untreated, FACS, are included if they have > 6,000 genes with five or more reads; MN-related cells (Look-Seq and Look-Seq2, generation 1 and generation 2) are included if they have > 4,000 genes with five or more reads. Column M displays the sequencing platform on which the RNA-Seq data were generated. Column N reports which cells were used in our control cell panel throughout the computational analysis.

Our new allelic transcription analysis revealed more than three pre-existing chromosome/arm-level copy-number changes in two families, F73 and F79. We therefore excluded these two families from the final summary.

In both Tab 3 and Tab 4, chromosomes with non-reference transcription that are inferred to be unrelated to micronucleation are shaded in gray. These include (1) pre-existing duplications that are shared by all family members; (2) mis-segregation events between family members (resulting in 0:2 copy number imbalance); and (3) other transcriptional variation patterns that do not match the predicted outcomes shown in Extended Data Fig. 2a.
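The library inclusion thresholds stated above (more than 6,000 genes with five or more reads for control cells, more than 4,000 for MN-related cells) can be illustrated with a minimal sketch. The function name, argument names and the per-gene read-count input are assumptions made for illustration; this is not the authors' pipeline.

```python
import numpy as np

def passes_gene_count_filter(read_counts, is_control, min_reads=5,
                             control_min_genes=6000, mn_min_genes=4000):
    """Return True if a single-cell library meets the gene-detection threshold.

    read_counts: per-gene read counts for one cell (hypothetical input layout).
    is_control:  True for control RPE-1 cells, False for MN-related cells.
    """
    counts = np.asarray(read_counts)
    # Number of genes detected with at least `min_reads` reads.
    n_detected = int(np.sum(counts >= min_reads))
    threshold = control_min_genes if is_control else mn_min_genes
    # The text uses a strict inequality ("> 6,000", "> 4,000").
    return n_detected > threshold
```

A library with exactly 6,000 detected genes would fail the control threshold under this reading but pass the MN-related one.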

Supplementary Videos 1 and 2. Examples of Look-Seq2 experiments.
The videos start during generation 1. An MN cell (GFP-H2B channel) and its sister are

Supplementary Video 4. Formation of an MN-body marked by SNAP-MDC1 - example 1.
In generation 1, a micronucleus forms and undergoes NE rupture (loss of RFP-NLS). The damaged chromosome from the micronucleus is only decorated with SNAP-MDC1 during mitosis. In generation 2, the SNAP-MDC1 marked chromosome is reincorporated into a daughter nucleus, forming a long-lived MN-body.

Supplementary Video 5. Formation of an MN-body marked by SNAP-MDC1 - example 2.
Similar to Supplementary Video 4. Note that the MN has already undergone rupture at the start of the video.

Supplementary Video 6. Formation of an MN-body marked by SNAP-MDC1 - example 3.
Similar to Supplementary Video 4. Note that the MN has already undergone rupture at the start of the video and that, in this example, only one of the two reincorporated micronuclei appears to form an MN-body.

Quantification of transcriptional changes relative to normal disomic transcription
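For a list of genes with expression in a single cell (a) given by $\mathrm{TPM}_i^{(a)}$ and reference expression (mean expression in control cells) given by $\langle \mathrm{TPM}_i \rangle$, we consider a weighted average TPM ratio of

$$\bar{T}^{(a)} = \frac{\sum_i \alpha_i T_i^{(a)}}{\sum_i \alpha_i}, \qquad T_i^{(a)} = \frac{\mathrm{TPM}_i^{(a)}}{\langle \mathrm{TPM}_i \rangle}, \tag{1}$$

where $T_i^{(a)}$ is the normalized expression of gene $i$ in cell (a) and $\alpha_i$ is the weight for gene $i$. If we assume that the $T_i^{(a)}$ of different genes are independent, the variance of this weighted TPM ratio is given by

$$\mathrm{Var}\left(\bar{T}^{(a)}\right) = \frac{\sum_i \alpha_i^2 \, \mathrm{Var}(T_i)}{\left(\sum_i \alpha_i\right)^2}. \tag{2}$$

Here $\mathrm{Var}(T_i)$ is the variance of the TPM ratio for gene $i$, estimated from control cells. It is straightforward to verify that the variance in Eq. (2) is minimized when

$$\alpha_i \propto \frac{1}{\mathrm{Var}(T_i)}. \tag{3}$$

We therefore weighted each TPM ratio $T_i^{(a)}$ by the inverse of its variance estimated from control RPE-1 cells (198 total). To ensure numerical stability of the weighted average, we capped the weights $\alpha_i$ at the 95th percentile of all weights across the genome (excluding those from Chr.X).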
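The inverse-variance weighting with capped weights can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the array layout (control cells in rows, genes in columns) are assumptions, the cap is computed over whatever genes are passed in (the text computes it genome-wide, excluding Chr.X), and every gene is assumed to have non-zero variance in the control cells.

```python
import numpy as np

def inverse_variance_weights(control_ratios, cap_percentile=95.0):
    """Per-gene weights 1/Var(T_i), capped at a percentile of all weights.

    control_ratios: array of shape (n_control_cells, n_genes) holding the
    TPM ratios T_i of control cells (hypothetical input layout).
    """
    control_ratios = np.asarray(control_ratios, dtype=float)
    # Sample variance of each gene's TPM ratio across control cells.
    var = np.var(control_ratios, axis=0, ddof=1)
    weights = 1.0 / var  # assumes var > 0 for every gene
    # Cap very large weights so that a few low-variance genes do not dominate.
    cap = np.percentile(weights, cap_percentile)
    return np.minimum(weights, cap)

def weighted_mean_ratio(tpm_cell, tpm_ref_mean, weights):
    """Weighted average TPM ratio of one cell over a list of genes."""
    ratios = np.asarray(tpm_cell, dtype=float) / np.asarray(tpm_ref_mean, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return np.sum(weights * ratios) / np.sum(weights)
```

In use, the weights would be computed once from the control panel and then reused for every cell and every gene list (for example, all genes on one chromosome).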
We used a similar strategy to calculate the average allelic fraction over multiple genes (both in 10 Mb intervals and across each chromosome) as

$$\overline{\mathrm{AF}}^{(a)} = \frac{\sum_i \beta_i \, \mathrm{AF}_i^{(a)}}{\sum_i \beta_i},$$

where $\mathrm{AF}_i^{(a)}$ denotes the allelic fraction of gene $i$ in cell (a) and $\beta_i$ is the weight. We also capped the weights $\beta_i$ at the 95th percentile of all weights across the genome (excluding those from Chr.X). For Chr.X transcripts with a predominant Xa (active X) bias, we calculated the allelic fraction as a simple average.
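The averaging of allelic fractions in 10 Mb intervals can be sketched as a grouped weighted mean. The function name, the use of a single genomic coordinate per gene, and the assumption that all genes lie on one chromosome are illustrative choices, not details given in the text.

```python
import numpy as np

def binned_weighted_allelic_fraction(positions, allelic_fractions, weights,
                                     bin_size=10_000_000):
    """Weighted mean allelic fraction per fixed-size genomic bin.

    positions: one base-pair coordinate per gene on a single chromosome
               (hypothetical choice, e.g. the gene start).
    Returns an array with one value per bin; bins without genes are NaN.
    """
    pos = np.asarray(positions, dtype=np.int64)
    af = np.asarray(allelic_fractions, dtype=float)
    w = np.asarray(weights, dtype=float)
    bins = pos // bin_size                    # bin index of each gene
    n_bins = int(bins.max()) + 1
    # Sum of weight * AF and sum of weights within each bin.
    num = np.bincount(bins, weights=w * af, minlength=n_bins)
    den = np.bincount(bins, weights=w, minlength=n_bins)
    out = np.full(n_bins, np.nan)
    np.divide(num, den, out=out, where=den > 0)
    return out
```

Passing a `bin_size` larger than the chromosome length collapses all genes into one bin and gives the chromosome-wide average.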

Cell-specific TPM normalization
As TPM represents relative transcript abundance, the TPM values in a single cell are uniformly amplified or attenuated by up- or down-regulation of one or a few highly transcribed genes that significantly alter the total mRNA content. To account for such global changes, we performed an additional global normalization of the TPM ratios in each cell:

$$t_i^{(a)} = g^{(a)} \, T_i^{(a)}.$$

The global scaling factor $g^{(a)}$ was introduced to normalize the inverse variance-weighted mean TPM ratio to unity in each cell (a), i.e.,

$$\frac{\sum_i \alpha_i \, t_i^{(a)}}{\sum_i \alpha_i} = 1.$$

Therefore,

$$g^{(a)} = \frac{\sum_i \alpha_i}{\sum_i \alpha_i \, T_i^{(a)}}.$$

We used the scaled TPM ratios $t_i^{(a)}$ for the average TPM ratio calculation. For most control cells, the scaling factor $g^{(a)}$ is close to 1; we therefore did not re-calibrate the mean $\langle \mathrm{TPM}_i \rangle$ and variance $\mathrm{Var}(T_i)$ of TPM values calculated from control cells.
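The cell-specific normalization can be sketched as below. The sketch assumes a multiplicative convention (scaled ratio = g times the raw ratio), which is one reading of "scaling factor"; the function name and inputs are hypothetical.

```python
import numpy as np

def globally_scaled_ratios(ratios, weights):
    """Rescale one cell's TPM ratios so their weighted mean equals 1.

    ratios:  TPM ratios T_i of one cell across all genes used for scaling.
    weights: per-gene weights alpha_i (e.g. capped inverse variances).
    Returns (g, scaled_ratios) with scaled_ratios = g * ratios (assumed convention).
    """
    ratios = np.asarray(ratios, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # g is the reciprocal of the weighted mean TPM ratio of the cell.
    g = np.sum(weights) / np.sum(weights * ratios)
    return g, g * ratios
```

After scaling, a chromosome-level average computed from the scaled ratios is expressed relative to the cell's own genome-wide mean, so a global shift in total mRNA content no longer appears as a change on every chromosome.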